Reversibility in Massive Concurrent Systems


Luca Cardelli, Cosimo Laneve


Abstract 


We introduce reversible structures, an algebra for massive concurrent 
systems, where terms retain bits of causal dependencies that allow one 
to reverse computation histories. We then study the implementation 
of (weak coherent) reversible structures in three-domains DNA strands, 
which is the natural model that has inspired reversible structures. 
We finally provide schemas for modeling significant synchronization 
patterns of process algebra into reversible structures and discuss the 
encoding of asynchronous Reversible CCS. 

Keywords: Reversible algebra, DNA strands, computational histories, 
synchronization patterns, encodings. 


1 Introduction 


Reversing a (forward) computation history means undoing the history. In 
concurrent systems, undoing the history is not performed in a deterministic 
way but in a causally consistent fashion, where states that are reached 
during a backward computation are states that could have been reached 
during the computation history by just performing independent actions in a 
different order. In RCCS [5], Danos and Krivine achieve this by attaching
a memory m to each process P, in the monitored process construct m: P. 
Memories in RCCS are stacks of information needed for processes to backtrack. 
Alternatively, Phillips and Ulidowski propose a technique for reversing process 
calculi without using memories [16]. In this technique, the structure of 
processes is not destroyed and the progress is noted by underlining the actions 
that have been performed. In order to tag the communicating processes, 
they generate unique identifiers on-the-fly during the communications. 
These foundational studies of reversible and concurrent computations
have been largely stimulated by areas such as chemical and biological systems 
— called massive concurrent systems in the following — where operations are 
reversible, and only an appropriate injection of energy and/or a change of 
entropy can move the computational system in a desired direction. 


However, there is a mismatch between chemical and biological systems
and the above concurrent formalisms. In the latter, reversibility means
desynchronizing processes that actually interacted in the past while, in massive 
concurrent systems, reversibility means reversibility of configurations. In 
order to make massive concurrent systems reversible with the process calculus 
meaning, one has to remember the position and momentum of each molecule, 
which is precisely contrary to the well-mixing assumption of biochemical 
soups, namely that the probability of collision between two molecules is 
independent of their position (cf. Gillespie’s algorithm [8]). 


We introduce an algebra for massive concurrent systems, called reversible 
structures, where terms retain bits of causal dependencies that allow one 
to reverse computation histories. These bits make it possible to trace the effects of
interactions, but not to the point of identifying the precise molecule
that caused an effect. For example, it is not possible to determine which signal
caused a reduction among the many copies of the same population. It
is worth remarking that, in reversible structures with populations of species
that are singletons (called coherent reversible structures in [4]), causality
has a meaning that is consistent with that of standard process calculi [5, 16].
While these latter structures are not currently realizable, they may become 
realizable in the future if we learn how to control individual molecules. 


Reversible structures may implement significant CCS-style interaction 
patterns (Cardelli already noticed this by studying a class of reversible 
systems — the DNA chemical systems [2, 3]). Consider for example a binary 
operator that takes two input molecules and produces one unrelated output 
molecule when (and only when) both inputs are present. It is too difficult to 
engineer the input machinery in order to account for any possible pattern of 
interaction, and to produce the output molecule out of their own structure. 
This operator is therefore implemented by an artifact that binds the two 
inputs one after the other and then releases the output out of its own 
structure. Of course, if the second input never comes, the structure must 
release the first input, because the first input may be legitimately used by 
some other operator. This means that the binding of the first input must 
be reversible, and the natural reversibility of our structures is exploited to
achieve correctness.

Figure 1: A DNA domain


These remarks allow us to draw a precise measure of the expressive 
power of reversible structures. In fact, we compile a sub-calculus of RCCS [5],
its asynchronous fragment, into reversible structures and demonstrate a strong
correspondence between causalities in the two formalisms.


Structure of the paper. In Section 2 we overview the model that inspired 
our algebra: DNA three-domains strands and their dynamics. In Section 3 we 
define reversible structures and discuss a few properties of these structures. 
In Section 4 we study the encoding of reversible structures in DNA circuits. 
In Section 5 we model standard synchronization patterns in our formalism. 
In Section 6 we study the compilation of asynchronous RCCS [5] in reversible 
structures. We conclude in Section 7 by discussing the theoretical results 
in [4] and by outlining some future work. 

This paper contains an introduction to reversible structures by discussing 
the motivations that led to their definition and by establishing a precise 
relationship between reversible structures and reversible process calculi. The 
purpose of the companion paper [4] is to provide a detailed presentation of 
the theory developed to date and to discuss algorithmic issues of decision 
problems. 


2 Three-domains DNA strands and causality 


There are many ways of computing with DNA structures. They all use the 
Watson-Crick complement that we briefly discuss. 

DNA strands are sequences of bases (Adenine, Cytosine, Guanine, and 
Thymine). Subsequences of bases, called domains, are independent of each
other: a domain cannot hybridize with any other domain except the
sequence consisting of the complementary bases (Adenine is complementary to
Thymine, Cytosine is complementary to Guanine). In Figure 1 we illustrate a
strand of three domains, that have different names because the corresponding 
sequences of bases are different. Single strands have an orientation; double 
strands are composed of two single strands with opposite orientations, where
the bottom strand is the complement of the top strand. The “short” domains,
called toeholds and depicted in red in the following figures, hybridize
(bind) reversibly to their complements, while the “long” domains hybridize
irreversibly; the exact critical length depends on physical conditions. Figure 2
illustrates the hybridizations, where distinct letters indicate domains that do
not hybridize with each other.

Figure 3: Reversible branch migration
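
As a small illustration of the Watson-Crick complement described above, the following Python sketch computes the strand that pairs with a given domain. Representing domains as strings over the letters A, C, G, T, as well as the names `complement`, `bottom_strand` and `can_hybridize`, are assumptions made for this example and do not come from the paper.

```python
# Assumed representation: a domain is a string over the four bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(domain: str) -> str:
    """Base-wise Watson-Crick complement (A<->T, C<->G)."""
    return "".join(COMPLEMENT[base] for base in domain)

def bottom_strand(top: str) -> str:
    """Bottom strand of a double strand, read in its own orientation,
    which is opposite to the orientation of the top strand."""
    return complement(top)[::-1]

def can_hybridize(x: str, y: str) -> bool:
    """In this sketch, x and y hybridize only if y is exactly the
    complementary sequence of x (with opposite orientation)."""
    return y == bottom_strand(x)
```

For instance, under these assumptions `can_hybridize("AACG", "CGTT")` holds, while a domain does not hybridize with a copy of itself unless the sequence happens to be self-complementary.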

An additional fundamental mechanism, called toehold mediated branch 
migration [18], allows displacements of strands composed of long- and short- 
domains in a reversible way, as illustrated in the leftmost picture of Figure 3. 
In the first reaction of Figure 3, a toehold t initiates a binding between
a double strand and a single strand. After the (reversible) binding of the 
toehold, the x domain of the single strand gradually replaces the top x strand 
of the double strand by branch migration. The branching point between 
the two top x domains performs a random walk that eventually leads to 
the displacement of the x strand. If the toehold matches but the branch 
migration region does not match, then the signal will eventually break off 
from the gate, as if nothing had happened. The last reaction unbinds the 
rightmost toehold of the double strand, thus obtaining a single strand z:t. 
The whole process may be reverted by binding the toehold of the single 
strand to the rightmost toehold of the double strand. 

In this paper we consider a somewhat different subset of DNA strands
(the reasons are given in the following discussion about causal dependencies),
which is a refined version of the “see-saw gates” model of Qian and
Winfree [17]. Our DNA system consists of signals and gates. A signal is
encoded as a single-stranded DNA sequence consisting of three contiguous 
domains as illustrated in Figure 1. The first domain is a “history” domain 
arising from previous interactions: it can be in principle arbitrarily long and 
is not part of the signal identity. That is, two signals are equal provided 
they are equal in the second and third domains, ignoring the history domain. 
The second domain is a toehold: it initiates the signal processing. The 
third long-domain is a branch migration domain: it stabilizes the interaction 
initiated by toehold binding. 

Branch migration, like toehold binding, is also a reversible reaction: 
it is a bidirectional random walk, zipping and unzipping complementary 
DNA strands. Although there is no directionality, one can arrange that 
when branch migration randomly reaches one end of a region, it causes 
another strand to detach. If that detached strand has a toehold to go 
back, it can then start reversible branch migration again, and we have a 
reversible overall reaction: this whole process is called toehold exchange. As 
an example we discuss the simplest DNA gate: a reversible signal transducer. 
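A transducer ${}^\wedge a.\overline{b}$ takes an input signal $\overline{a}$ (a fixed input) and produces
an output signal $\overline{b}$, becoming $a.\overline{b}^\wedge$. The expected reduction semantics is

$$\overline{a} \mid {}^\wedge a.\overline{b} \;\rightleftharpoons\; a.\overline{b}^\wedge \mid \overline{b}.$$

We model ${}^\wedge a.\overline{b}$ and $a.\overline{b}^\wedge$ with the structures

[Figure: DNA structures of the gates ${}^\wedge a.\overline{b}$ (left) and $a.\overline{b}^\wedge$ (right)]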


and we discuss the behaviour of ${}^\wedge a.\overline{b}$ when a signal with name a binds to
it (we assume toeholds are always complementary). In Figure 4, leftmost
picture, the toehold of the signal binds to the complementary toehold of the
gate. Then branch migration starts, where the domain a of the signal and
the upper strand a of the gate compete for the lower strand. When branch
migration (by a random walk) reaches the right end, the only thing holding
the a domain is the second toehold of the gate, which can hence detach (see
the picture in the middle of Figure 4). Still, this toehold can reattach, and
the branch migration may then reach the left end causing the signal a with
history u to detach. Therefore, this toehold exchange is fully reversible. It is
also possible, however, that the single strand labelled v attaches to the gate
while a is detached, see the rightmost picture of Figure 4. This will displace
a signal b leaving unbound the rightmost lower toehold of the gate. Because
of this, the signal may bind again to the gate, thus reverting the behaviour.

Figure 4: The dynamics of a transducer: the leftmost solution represents
$\overline{a} \mid {}^\wedge a.\overline{b}$, the rightmost one represents $a.\overline{b}^\wedge \mid \overline{b}$

Three-domains DNA strands carry bits of information that often allow 
one to reverse computations in a causally consistent fashion. To illustrate 
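the question, consider the solution (written in process algebraic form for
simplicity)

$$\overline{a} \mid \overline{b} \mid {}^\wedge a.\overline{c} \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e}$$

and the computation

\begin{align*}
& \overline{a} \mid \overline{b} \mid {}^\wedge a.\overline{c} \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e} \\
\longrightarrow\ & \overline{b} \mid \overline{c} \mid a.\overline{c}^\wedge \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e} \tag{1} \\
\longrightarrow\ & \overline{b} \mid \overline{d} \mid a.\overline{c}^\wedge \mid {}^\wedge b.\overline{c} \mid c.\overline{d}^\wedge \mid {}^\wedge c.\overline{e} \tag{2} \\
\longrightarrow\ & \overline{c} \mid \overline{d} \mid a.\overline{c}^\wedge \mid b.\overline{c}^\wedge \mid c.\overline{d}^\wedge \mid {}^\wedge c.\overline{e} \tag{3} \\
\longrightarrow\ & \overline{d} \mid \overline{e} \mid a.\overline{c}^\wedge \mid b.\overline{c}^\wedge \mid c.\overline{d}^\wedge \mid c.\overline{e}^\wedge \tag{4}
\end{align*}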


where the $\overline{c}$ signal of the transducer ${}^\wedge a.\overline{c}$ has triggered the transducer ${}^\wedge c.\overline{d}$
and the $\overline{c}$ signal of the transducer ${}^\wedge b.\overline{c}$ has triggered the transducer ${}^\wedge c.\overline{e}$.
This computation may be reversed in different causally consistent ways. One
of them is to reverse the reductions (2) and (4), which are independent (they
concern different terms):


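$$\overline{c} \mid \overline{c} \mid a.\overline{c}^\wedge \mid b.\overline{c}^\wedge \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e}$$

In this last solution $\overline{c} \mid \overline{c} \mid a.\overline{c}^\wedge \mid b.\overline{c}^\wedge \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e}$ it is not possible
to determine the $\overline{c}$ that caused either the signal $\overline{d}$ or $\overline{e}$ because biological
systems are massively concurrent. Therefore one ends with identifying the
above computation with

\begin{align*}
& \overline{a} \mid \overline{b} \mid {}^\wedge a.\overline{c} \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e} \\
\longrightarrow\ & \overline{b} \mid \overline{c} \mid a.\overline{c}^\wedge \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e} \\
\longrightarrow\ & \overline{b} \mid \overline{e} \mid a.\overline{c}^\wedge \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid c.\overline{e}^\wedge \\
\longrightarrow\ & \overline{c} \mid \overline{e} \mid a.\overline{c}^\wedge \mid b.\overline{c}^\wedge \mid {}^\wedge c.\overline{d} \mid c.\overline{e}^\wedge \\
\longrightarrow\ & \overline{d} \mid \overline{e} \mid a.\overline{c}^\wedge \mid b.\overline{c}^\wedge \mid c.\overline{d}^\wedge \mid c.\overline{e}^\wedge
\end{align*}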


It is worth noticing that such identities may become troublesome as soon 
as different causal dependencies produce different visible effects, such as 
different colors of the solutions. 
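
The identification of the two computations can be checked with a few lines of Python. The sketch below is only an illustration under stated assumptions: solutions are modelled as multisets (`collections.Counter`) of species written as ASCII strings, with the overline on signals omitted, `^a.c` standing for an unused transducer and `a.c^` for a used one. The helper names `fire` and `run` are hypothetical.

```python
from collections import Counter

def fire(solution: Counter, inp: str, out: str) -> Counter:
    """Forward step of a transducer ^inp.out: consume one signal `inp`,
    mark one gate as used (inp.out^) and emit one signal `out`."""
    gate, used = f"^{inp}.{out}", f"{inp}.{out}^"
    if solution[inp] == 0 or solution[gate] == 0:
        raise ValueError("transducer not enabled")
    result = solution.copy()
    result.subtract({inp: 1, gate: 1})
    result.update({used: 1, out: 1})
    return +result  # unary plus drops species whose count fell to zero

def run(solution: Counter, steps) -> Counter:
    for inp, out in steps:
        solution = fire(solution, inp, out)
    return solution

initial = Counter(["a", "b", "^a.c", "^b.c", "^c.d", "^c.e"])
# First history: the c produced from a triggers ^c.d, the c from b triggers ^c.e.
history_1 = run(initial, [("a", "c"), ("c", "d"), ("b", "c"), ("c", "e")])
# Second history: the c produced from a triggers ^c.e instead.
history_2 = run(initial, [("a", "c"), ("c", "e"), ("b", "c"), ("c", "d")])
```

Because a multiset keeps no record of which copy of `c` was consumed by which gate, both histories end in the same solution, which is exactly the loss of causal information discussed above.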
Figure 5: Causality problems


A standard solution to this problem is to record the strands that 
synchronized and the literature reports several techniques to achieve this [11, 
1, 5, 16, 9]. Our three-domain structures implement the technique proposed 
by Lévy [11] that, in the above example, amounts to using signals $\overline{c}$ with
different histories according to whether they are produced by the transducer
${}^\wedge a.\overline{c}$ or ${}^\wedge b.\overline{c}$. In fact, the reader may remark that there is no mixing of
causal dependencies in the solution


[Figure: three-domain DNA strands of the solution]

that encodes $\overline{a} \mid \overline{b} \mid {}^\wedge a.\overline{c} \mid {}^\wedge b.\overline{c} \mid {}^\wedge c.\overline{d} \mid {}^\wedge c.\overline{e}$ in three-domain structures.
However, mixing of causalities is still possible in three-domain DNA 
strands because of massive concurrency and because of bad designs. In 
Figure 5, left picture, it is not possible to determine if the signal b has 
been produced by the lower or the upper gate, because they belong to the 
same species. As we said, this kind of confusion is unavoidable in massive 
concurrent systems. In the right picture of Figure 5, again, the signal b may 
bind either to the lower or the upper gate, even if they do not belong to 
the same species. In this case the situation is worse because the designer 
has used the same history id in two different gates. We notice that these 
solutions may be banned by a simple static verifier enforcing that different 
species retain different history ids (called weak coherence, see Section 3). 
3 The algebra of reversible structures


In this section we define a process algebra, called reversible structures, for
the three-domains DNA strands discussed in the previous section. We use
three disjoint infinite sets: names $\mathcal{N}$, ranged over by $a, b, c, \dots$, co-names
$\overline{\mathcal{N}}$, ranged over by $\overline{a}, \overline{b}, \overline{c}, \dots$, and a countable set of ids, ranged over by $u, v, w, \dots$.
Names and co-names are ranged over by $\alpha, \alpha', \dots$ and $\overline{\overline{a}} = a$. Names
and ids are ranged over by $x, x', \dots$. The following notations for sequences of
actions will be used:


- sequences of $\mathcal{N}$ are ranged over by $A, B, \dots$;


- sequences of elements $u{:}\overline{a}$ are ranged over by $\overline{A}, \overline{B}, \dots$;


- sequences of elements $u{:}a$ are ranged over by $A^\perp, B^\perp, \dots$;


The empty sequence is represented by $\varepsilon$; the length of a sequence is given by
the function $\mathit{length}(\cdot)$.

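The syntax of reversible structures includes gates $g$ and structures $S$
and consists of the rules:

$$\begin{array}{rcll}
g & ::= & A^\perp . {}^\wedge B . \overline{C} & (\mathit{length}(A^\perp . B) > 0) \\
  & \mid & A^\perp . \overline{B} . {}^\wedge \overline{C} & (\mathit{length}(A^\perp) > 0) \\[4pt]
S & ::= & 0 & \text{(null)} \\
  & \mid & u{:}\overline{a} & \text{(signal)} \\
  & \mid & g & \text{(gate)} \\
  & \mid & S \mid S & \text{(parallel)} \\
  & \mid & (\mathrm{new}\ x)\,S & \text{(new)}
\end{array}$$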


A gate is a term that accepts input signals $u{:}\overline{a}$ and emits output signals,
reversibly. The form $A^\perp . {}^\wedge B . \overline{C}$ represents input-accepting gates, at least when
not considering reverse reactions: $A^\perp$ are the inputs that have been processed,
$B$ are the inputs still to be processed, and $\overline{C}$ are the outputs to be emitted.
The other form $A^\perp . \overline{B} . {}^\wedge \overline{C}$ represents an output-producing gate (when not
considering reverse reactions): $A^\perp$ is as before, $\overline{B}$ are the outputs that
have been emitted, and $\overline{C}$ are the outputs still to be emitted. Since all the
inputs in a gate have to be processed before the outputs are produced, we
do not need to consider other forms. In both forms, the symbol ${}^\wedge$, called
gate pointer, indicates the next operations (one forward and one backward)
that the gate can perform. A structure may be either a void structure $0$,
or a signal $u{:}\overline{a}$ denoting an elementary message $\overline{a}$ with an id $u$, or a gate
$g$, or a parallel composition “$\mid$” that collects gates and signals and allows
them to interact. A structure may also be $(\mathrm{new}\ x)\,S$, which limits the scope
of a name or id $x$ to $S$; $x$ is said to be bound in $(\mathrm{new}\ x)\,S$. This is the only
binding operator in reversible structures.

For example, the transducer depicted in Figure 4 (leftmost structure)
is defined by $u{:}\overline{a} \mid {}^\wedge a.v{:}\overline{b}$. This solution may evolve into $u{:}a.{}^\wedge v{:}\overline{b}$ by
inputting the signal $u{:}\overline{a}$, see the middle structure in Figure 4. At this stage,
a signal $v{:}\overline{b}$ may be emitted, thus becoming $u{:}a.v{:}\overline{b}^\wedge$ (rightmost structure
in Figure 4), or the gate may backtrack to ${}^\wedge a.v{:}\overline{b}$ by releasing the signal $u{:}\overline{a}$ (see the
following semantics). Another example is a sink gate, such as ${}^\wedge a.b$, that
collects signals (and, in a stochastic model, may hold them for a while). This
gate may evolve into $u{:}a.{}^\wedge b$, and then may become $u{:}a.v{:}b^\wedge$.
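
The syntax of gates can be mirrored by a small data type. The following Python sketch is one possible representation, not the one used by the authors: the class names, the field names and the `show` printer are assumptions made for illustration. Overlines are omitted in the printed form, and the gate pointer is written `^`.

```python
from dataclasses import dataclass
from typing import Tuple

Name = str  # a name such as "a" (outputs are read as the co-name of the name)
Id = str    # a history id such as "u"

@dataclass(frozen=True)
class Signal:
    ident: Id
    name: Name  # stands for the co-name of `name`

@dataclass(frozen=True)
class Gate:
    done_inputs: Tuple[Tuple[Id, Name], ...]   # inputs already captured, with ids
    todo_inputs: Tuple[Name, ...]              # inputs still to be captured
    done_outputs: Tuple[Tuple[Id, Name], ...]  # outputs already emitted
    todo_outputs: Tuple[Tuple[Id, Name], ...]  # outputs still to be emitted

    def __post_init__(self):
        # All inputs must be processed before any output is produced.
        if self.todo_inputs and self.done_outputs:
            raise ValueError("outputs emitted before all inputs were captured")
        # Side conditions of the grammar: a gate has at least one input.
        if not self.done_inputs and not self.todo_inputs:
            raise ValueError("a gate needs at least one input")

def show(g: Gate) -> str:
    """Print a gate with the pointer '^' between performed and pending operations."""
    before = [f"{u}:{a}" for u, a in g.done_inputs + g.done_outputs]
    after = list(g.todo_inputs) + [f"{u}:{a}" for u, a in g.todo_outputs]
    dot = "." if before and after else ""
    return ".".join(before) + dot + "^" + ".".join(after)
```

With this representation the three configurations of the transducer print as `^a.v:b`, `u:a.^v:b` and `u:a.v:b^`.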

We often abbreviate the parallel of $S_i$ for $i \in I$, where $I$ is a finite set,
with $\prod_{i \in I} S_i$. We write $(\mathrm{new}\ x_1, \dots, x_n)\,S$ for $(\mathrm{new}\ x_1) \cdots (\mathrm{new}\ x_n)\,S$, $n > 0$,
and sometimes we shorten $x_1, \dots, x_n$ into $\widetilde{x}$. The free names and ids in $S$,
denoted $\mathit{fn}(S)$, are the names and ids in $S$ with a non-bound occurrence.

Structures we will never want to distinguish for any semantic reason are
identified by a congruence. Let $\equiv$, called structural congruence, be the least
congruence between structures containing alpha equivalence and satisfying
the abelian monoid laws for parallel (associativity, commutativity and $0$ as
identity), and the scope laws

$$(\mathrm{new}\ x)\,0 \equiv 0 \qquad (\mathrm{new}\ x)\,(\mathrm{new}\ x')\,S \equiv (\mathrm{new}\ x')\,(\mathrm{new}\ x)\,S$$
$$S \mid (\mathrm{new}\ x)\,S' \equiv (\mathrm{new}\ x)\,(S \mid S'), \quad \text{if } x \notin \mathit{fn}(S)$$

It is easy to demonstrate the following property.

Proposition 1 For every $S$, $S \equiv (\mathrm{new}\ \widetilde{x})\,(\prod_{i \in I} g_i \mid \prod_{j \in J} u_j{:}\overline{a}_j)$. The structure
$(\mathrm{new}\ \widetilde{x})\,(\prod_{i \in I} g_i \mid \prod_{j \in J} u_j{:}\overline{a}_j)$, which is unique up to alpha equivalence,
the order of names and ids in the sequence $\widetilde{x}$, and the order of gates and
signals, is called the normal form of $S$.


The semantics of reversible structures is defined operationally by means 
of a reduction relation. 
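Definition 1 The reduction relation of reversible structures is the least
relation $\longrightarrow$ satisfying the axioms

$$\begin{array}{lrcl}
\text{(input capture)} & u{:}\overline{a} \mid A^\perp . {}^\wedge a . B . \overline{C} & \longrightarrow & A^\perp . u{:}a . {}^\wedge B . \overline{C} \\
\text{(input release)} & A^\perp . u{:}a . {}^\wedge B . \overline{C} & \longrightarrow & u{:}\overline{a} \mid A^\perp . {}^\wedge a . B . \overline{C} \\
\text{(output release)} & A^\perp . \overline{B} . {}^\wedge u{:}\overline{a} . \overline{C} & \longrightarrow & u{:}\overline{a} \mid A^\perp . \overline{B} . u{:}\overline{a} . {}^\wedge \overline{C} \\
\text{(output capture)} & u{:}\overline{a} \mid A^\perp . \overline{B} . u{:}\overline{a} . {}^\wedge \overline{C} & \longrightarrow & A^\perp . \overline{B} . {}^\wedge u{:}\overline{a} . \overline{C}
\end{array}$$

and closed under the rules

$$\frac{S \longrightarrow S'}{(\mathrm{new}\ x)\,S \longrightarrow (\mathrm{new}\ x)\,S'} \qquad \frac{S \longrightarrow S'}{S \mid S'' \longrightarrow S' \mid S''} \qquad \frac{S_1 \equiv S_1' \quad S_1' \longrightarrow S_2' \quad S_2' \equiv S_2}{S_1 \longrightarrow S_2}$$

Sequences of reductions, called computations, are noted $\longrightarrow^*$.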


The reductions (input capture) and (output release) are called forward reduc- 
tions, the reductions (input release) and (output capture) are called backward 
reductions. 

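The axioms of reversible structures semantics are explained below by
discussing the reductions of the transducer ${}^\wedge a.v{:}\overline{b}$ when exposed to signals
$u{:}\overline{a}$ and $w{:}\overline{a}$. The transducer may behave either as
$u{:}\overline{a} \mid w{:}\overline{a} \mid {}^\wedge a.v{:}\overline{b} \longrightarrow w{:}\overline{a} \mid u{:}a.{}^\wedge v{:}\overline{b}$ or as
$u{:}\overline{a} \mid w{:}\overline{a} \mid {}^\wedge a.v{:}\overline{b} \longrightarrow u{:}\overline{a} \mid w{:}a.{}^\wedge v{:}\overline{b}$, according to
whether the axiom (input capture) is instantiated with the signal $u{:}\overline{a}$ or
with $w{:}\overline{a}$ (in these cases $A^\perp$ is empty). In turn, $w{:}\overline{a} \mid u{:}a.{}^\wedge v{:}\overline{b}$ may reduce with
(output release) as $w{:}\overline{a} \mid u{:}a.{}^\wedge v{:}\overline{b} \longrightarrow w{:}\overline{a} \mid u{:}a.v{:}\overline{b}^\wedge \mid v{:}\overline{b}$ or may backtrack
with (input release) as follows: $w{:}\overline{a} \mid u{:}a.{}^\wedge v{:}\overline{b} \longrightarrow u{:}\overline{a} \mid w{:}\overline{a} \mid {}^\wedge a.v{:}\overline{b}$.
This backtracking is always possible in our algebra. In fact, it is a direct
consequence of the property that, for every axiom $S \longrightarrow S'$ of Definition 1,
there is a “converse one” $S' \longrightarrow S$.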
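
The four axioms can be prototyped on multisets of gates and signals. The sketch below is a minimal illustration under assumptions that are not in the paper: structures are restricted to parallel compositions without new, a gate is stored as its inputs, its outputs and a pointer counting the operations performed so far, and the names `Signal`, `Gate` and `successors` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple, Tuple, Union

class Signal(NamedTuple):
    ident: str
    name: str  # stands for the co-name of `name`

class Gate(NamedTuple):
    inputs: Tuple[Union[str, Signal], ...]  # a bare name is still to be captured
    outputs: Tuple[Signal, ...]
    pointer: int  # number of operations performed so far

def successors(solution: Counter):
    """Yield (axiom, next_solution) for every instance of the four axioms."""
    for gate in [m for m in solution if isinstance(m, Gate)]:
        n, p = len(gate.inputs), gate.pointer
        rest = solution - Counter([gate])
        if p < n:  # (input capture): any signal with the expected name may bind
            for sig in [m for m in rest if isinstance(m, Signal)]:
                if sig.name == gate.inputs[p]:
                    ins = gate.inputs[:p] + (sig,) + gate.inputs[p + 1:]
                    new = gate._replace(inputs=ins, pointer=p + 1)
                    yield "input capture", rest - Counter([sig]) + Counter([new])
        if 0 < p <= n:  # (input release): give back the last captured signal
            sig = gate.inputs[p - 1]
            ins = gate.inputs[:p - 1] + (sig.name,) + gate.inputs[p:]
            new = gate._replace(inputs=ins, pointer=p - 1)
            yield "input release", rest + Counter([sig, new])
        if n <= p < n + len(gate.outputs):  # (output release)
            sig = gate.outputs[p - n]
            yield "output release", rest + Counter([sig, gate._replace(pointer=p + 1)])
        if p > n:  # (output capture): needs the emitted signal, id included
            sig = gate.outputs[p - n - 1]
            if rest[sig] > 0:
                new = gate._replace(pointer=p - 1)
                yield "output capture", rest - Counter([sig]) + Counter([new])

# The transducer ^a.v:b exposed to the signals u:a and w:a.
initial = Counter([Signal("u", "a"), Signal("w", "a"),
                   Gate(("a",), (Signal("v", "b"),), 0)])
```

Calling `list(successors(initial))` yields the two instances of (input capture) discussed above, and every solution reached in one step has `initial` among its own successors, in line with the converse property.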


Proposition 2 For any reduction $S \longrightarrow S'$ there exists a converse one
$S' \longrightarrow S$.


We notice that ${}^\wedge a.u{:}\overline{b} \mid v{:}\overline{a} \mid {}^\wedge a.u{:}\overline{b} \equiv v{:}\overline{a} \mid {}^\wedge a.u{:}\overline{b} \mid {}^\wedge a.u{:}\overline{b}$ (and
similarly for every permutation of gates and signals). In these structures,
the two occurrences of ${}^\wedge a.u{:}\overline{b}$ are indistinguishable, that is, it is not possible
to identify the precise gate ${}^\wedge a.u{:}\overline{b}$ that performs the reduction
${}^\wedge a.u{:}\overline{b} \mid v{:}\overline{a} \mid {}^\wedge a.u{:}\overline{b} \longrightarrow v{:}a.{}^\wedge u{:}\overline{b} \mid {}^\wedge a.u{:}\overline{b}$. This feature formalizes the well-mixing
assumption of chemical solutions, namely that the probability of collision
between two molecules is independent of their position. This is also the
main difference between our model and reversible process calculi models
as [5, 15], where every element has a unique tag. We finally notice that, as a
consequence of the above identities, the notions of causality and independence
of reductions are different from [5, 16] because different molecules of the
same chemical species are indistinguishable in our setting.

By Proposition 1 and the definition of the reduction relation, it is 
possible to restrict the arguments about the dynamics of reversible structures 
to structures in normal forms. In turn, the following statement allows one
to limit the analysis to the subclass of structures without news when the 
interest is in computations of “closed” structures, namely structures that do 
not interact with the external environment. (This simplifies the following 
notion of weak coherence.) 


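Proposition 3 $(\mathrm{new}\ \widetilde{x})\,(\prod_{i \in I} g_i \mid \prod_{j \in J} u_j{:}\overline{a}_j) \longrightarrow (\mathrm{new}\ \widetilde{x})\,(\prod_{i \in I} g_i' \mid \prod_{j \in J'} u_j'{:}\overline{a}_j')$
if and only if $\prod_{i \in I} g_i \mid \prod_{j \in J} u_j{:}\overline{a}_j \longrightarrow \prod_{i \in I} g_i' \mid \prod_{j \in J'} u_j'{:}\overline{a}_j'$.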


In the following, if not otherwise specified, the structures will be con- 
sidered without news. 


Definition 2 A structure $S$ is weak coherent whenever ids are uniquely
associated to names and co-names. That is, if $u{:}\alpha$ and $u{:}\alpha'$ occur in $S$ then
either $\alpha = \alpha'$ or $\alpha = \overline{\alpha'}$.


For example, the structure $u{:}a.v{:}\overline{b}^\wedge \mid v{:}\overline{c}$ is not weak coherent because $v$ is
associated to two different co-names, while $u{:}a.v{:}\overline{b}^\wedge \mid v{:}\overline{b}$ is weak coherent.
Weak coherence is an invariant of the reduction relation. 


Proposition 4 If $S$ is weak coherent and $S \longrightarrow S'$ then $S'$ is weak coherent.
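
Definition 2 lends itself to a simple static check. The sketch below is an illustration only; it assumes that the structure has already been flattened into the list of all pairs (id, name) occurring in it, with the overline dropped since a name and its co-name may share an id. The function name `is_weak_coherent` is hypothetical.

```python
from typing import Dict, Iterable, Tuple

def is_weak_coherent(occurrences: Iterable[Tuple[str, str]]) -> bool:
    """`occurrences` lists every pair (id, name) such that the id is attached
    to the name or to its co-name somewhere in the structure."""
    seen: Dict[str, str] = {}
    for ident, name in occurrences:
        # setdefault records the first name met for this id and returns it
        if seen.setdefault(ident, name) != name:
            return False
    return True

# The two example structures above, flattened by hand.
not_coherent = [("u", "a"), ("v", "b"), ("v", "c")]
coherent = [("u", "a"), ("v", "b"), ("v", "b")]
```

Under these assumptions `is_weak_coherent(not_coherent)` is false because the id `v` is attached to both `b` and `c`, while `is_weak_coherent(coherent)` is true.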


4 The compilation of reversible structures into three- 
domains DNA strands 


In this section we detail the implementation of weak coherent reversible 
structures into three-domains DNA strands. We have already presented these
strands in Section 2. Here we complete the presentation by discussing the
semantic rules in Figure 6.

[Figure 6 panels: rules (binding), (migration), and (p-binding), the latter with side condition cprefix(x, y) > 0]

Figure 6: Dynamics of DNA strands


A DNA solution is a multiset of strands that may be either single two- 
domains toehold/long-domain or single three-domains long-domain/toehold/ 
long-domain or double strands. The double strands are (i) composed of two 
single strands with opposite orientation, where the bottom strand is the 
complement of the top strand; (ii) are toehold-mediated, namely they are 
sequences of alternating toeholds and long-domains. We assume there is a 
unique toehold in strands; therefore complementary toeholds always match. 
The definition of DNA solution is purposely left informal because below we 
only consider a subclass of solutions implementing reversible structures. 


The dynamics of DNA solutions is defined in Figure 6. Rule (binding)
models toehold hybridizations between a double strand with a hole in the
upper strand, in correspondence of a toehold, and a signal. The dotted lines
represent domains that may be missing so, technically, the rule is a schema. Rule
(migration) defines branch migrations of matching long-domains, while rule
(p-binding) defines partial branch migrations of mismatching domains. In
this latter case the bases may hybridize up to the longest matching prefix of
the domains and then may unbind. Formally, there is a function cprefix(x, y)
from pairs of names and ids to naturals that defines the largest matching
prefix between them. This function is not defined on pairs of identical names
or of identical ids. The rule (p-binding) is applied provided cprefix(x, y) is
positive.
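
One way to picture the function cprefix is the following Python sketch. It assumes that names and ids are given as strings of bases and that "matching prefix" means the longest common prefix of the two strings; both points are assumptions made for this illustration, since the paper keeps the function abstract. The helper `p_binding_enabled` is likewise hypothetical.

```python
def cprefix(x: str, y: str) -> int:
    """Length of the longest common prefix of two distinct domains
    (assumed to be strings of bases)."""
    if x == y:
        raise ValueError("cprefix is not defined on identical domains")
    n = 0
    for bx, by in zip(x, y):
        if bx != by:
            break
        n += 1
    return n

def p_binding_enabled(x: str, y: str) -> bool:
    """Side condition of rule (p-binding): mismatching domains that
    nevertheless share a non-empty prefix."""
    return x != y and cprefix(x, y) > 0
```

For example, `cprefix("ACGT", "ACTT")` is 2 under these assumptions, so the partial binding is enabled, while two domains that differ in their first base cannot bind partially.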
Without loss of generality, by Proposition 3, we restrict the following
discussion to structures without new. The encoding $\llparenthesis \cdot \rrparenthesis$ of reversible
structures to DNA strands is homomorphic with respect to parallel, is such
that $\llparenthesis 0 \rrparenthesis = \emptyset$, and it is defined on signals and gates as follows (for gates
we only illustrate the encodings of configurations of $a_1 . a_2 . v_1{:}\overline{b}_1 . v_2{:}\overline{b}_2$):


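[Figure: DNA strands encoding the configurations $\llparenthesis {}^\wedge a_1 . a_2 . v_1{:}\overline{b}_1 . v_2{:}\overline{b}_2 \rrparenthesis$, $\llparenthesis u_1{:}a_1 . {}^\wedge a_2 . v_1{:}\overline{b}_1 . v_2{:}\overline{b}_2 \rrparenthesis$ and $\llparenthesis u_1{:}a_1 . u_2{:}a_2 . v_1{:}\overline{b}_1 . {}^\wedge v_2{:}\overline{b}_2 \rrparenthesis$]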


The strict correspondence between reversible structures and DNA three-domains
strands is fixed by the following statement. Let $T$ and $T'$ be two
DNA solutions. We write $T =_{b,p} T'$ if and only if $T \longrightarrow^* T'$ with rules (binding)
and (p-binding). By reversibility, the relation $=_{b,p}$ is symmetric.


Theorem 1 $S \longrightarrow S'$ implies $\llparenthesis S \rrparenthesis \longrightarrow^* \llparenthesis S' \rrparenthesis$. Additionally, if $S$ is weak
coherent and $\llparenthesis S \rrparenthesis \longrightarrow^* T$ then there is $S'$ such that $T =_{b,p} \llparenthesis S' \rrparenthesis$.


Proof: The proof that $S \longrightarrow S'$ implies $\llparenthesis S \rrparenthesis \longrightarrow^* \llparenthesis S' \rrparenthesis$ is a simple case analysis
on the axiom used in $S \longrightarrow S'$. Figure 4 illustrates the correspondence
when the axiom is an (input capture) (in the simple case of a transducer).
The analysis of the other axioms is omitted.

Let $S$ be weak coherent. The proof that $\llparenthesis S \rrparenthesis \longrightarrow^* T$ implies that there is $S'$
such that $T =_{b,p} \llparenthesis S' \rrparenthesis$ is by induction on the number of rules (migration)
used in $\llparenthesis S \rrparenthesis \longrightarrow^* T$. The case when this number is 0 is obvious. Assume
the statement holds when there are $n$ rules (migration), and let us prove the case
$n+1$. The computation may be split into

$$\llparenthesis S \rrparenthesis \longrightarrow^* T_1 \longrightarrow T_2 \longrightarrow^* T$$

where $T_1 \longrightarrow T_2$ is the $(n+1)$-th reduction due to an instance of (migration)
and $T_2 =_{b,p} T$. By inductive hypothesis, there is $T_1'$ such that $T_1 \longrightarrow^* T_1'$
with rules that are instances of (binding) and (p-binding) and such that
$T_1' = \llparenthesis S_1' \rrparenthesis$, for some $S_1'$. In case of strands obtained by the encoding $\llparenthesis \cdot \rrparenthesis$
there are four possible types of reduction $T_1 \longrightarrow T_2$:

[Figure: the four types (i)-(iv) of (migration) reduction]


In case (i), according to the signals that are present in the solutions, the
upper domain $a$ must be the final part of a signal starting with an id, say
$u$, and having a toehold in between: the signal is $u{:}\overline{a}$. Next, take the
reverse of $T_1 \longrightarrow^* T_1'$, let it be $T_1' \longrightarrow^* T_1$, and consider the computation
$T_1' \longrightarrow T_1'' \longrightarrow T_1'''$ where the first reduction is the (binding) of the toehold of
$u{:}\overline{a}$ and the second one is the (migration) of the domain $a$. By definition
$T_1''' = \llparenthesis S' \rrparenthesis$ where $S_1' \longrightarrow S'$ is an instance of (input capture). It remains to be
proved that $\llparenthesis S' \rrparenthesis \longrightarrow^* T_2$ and then, by reversibility, we obtain $T =_{b,p} \llparenthesis S' \rrparenthesis$.

Consider the computation $\llparenthesis S' \rrparenthesis \longrightarrow^* T_2'$ obtained by performing the
sequence of reductions of $T_1' \longrightarrow^* T_1$ and skipping those concerning the signal
$u{:}\overline{a}$ and the toehold of the double strand involved in $T_1' \longrightarrow T_1'' \longrightarrow T_1'''$. We
observe that


1. in $T_2'$ the context of the signal $u{:}\overline{a}$ and the toehold of the double strand
involved in $T_1' \longrightarrow T_1'' \longrightarrow T_1'''$ are the same as in $T_2$ because reductions
of DNA strands are context-free;


2. the configurations of the signal $u{:}\overline{a}$ and the toehold of the double strand
involved in $T_1' \longrightarrow T_1'' \longrightarrow T_1'''$ are the same in $T_2'$ and $T_2$ because (i)
$T_1' \longrightarrow^* T_1$, by definition of $T_1'$, must have an odd number of (binding)
rules concerning the toehold of the signal $u{:}\overline{a}$ since the toehold initially
is unbound and at the end is bound (otherwise the rule (migration)
cannot occur); (ii) the effects of reverse reductions on a solution are
void.


Therefore $T_2' = T_2$.

The other cases are omitted because they are similar; we only observe that (ii)
corresponds to an instance of (input release), (iii) corresponds to an instance
of (output release), and (iv) corresponds to an instance of (output capture).
□


The second part of Theorem 1 is restricted to weak coherent structures.
In fact, the statement that $\llparenthesis S \rrparenthesis \longrightarrow^* \llparenthesis S' \rrparenthesis$ implies $S \longrightarrow^* S'$ is false in the unrestricted
case. Consider the above encoding of the gate $u_1{:}a_1 . u_2{:}a_2 . v_1{:}\overline{b}_1 . {}^\wedge v_2{:}\overline{b}_2$
and observe that, in the DNA strand, the co-name $\overline{b}_1$ never appears. If the
(not weak coherent) structure also contained the signal $v_1{:}\overline{c}$ then the DNA
strand might reduce to the encoding of $u_1{:}a_1 . u_2{:}a_2 . {}^\wedge v_1{:}\overline{c} . v_2{:}\overline{b}_2$. However,
this gate cannot be obtained from the structure $u_1{:}a_1 . u_2{:}a_2 . v_1{:}\overline{b}_1 . {}^\wedge v_2{:}\overline{b}_2$.
It is worth noticing that there is a way of removing the constraint of weak
coherence in Theorem 1. The technique associates different toeholds to
different names, that is, the toeholds record the identity of names. While
this method works fine for solutions with few names, it is not practicable in
general because toeholds are usually very few with respect to (long-domain)
names. We also notice that the three-domains structures used by Qian
and Winfree [17] have the output parts of gates without ids. It is easy to
verify that such a model identifies more computations (i.e. mixes causal
dependencies) than our model.

We finally remark that Theorem 1 establishes a relation between “mean- 
ingful” reductions of the DNA solution and reductions of reversible structures, 
where “meaningful” means those reductions obtained with rules (migration). 
In fact, rules (binding) and (p-binding) must be considered as bureaucracies
used to prepare the solution for long-domains hybridizations. 


5 Modelings 


Despite their simplicity, reversible structures are quite expressive. In this 
section we discuss a number of synchronization patterns that are standard 
in process calculi and detail their modeling in reversible structures. As we 
will see, in every case, reversibility plays a basic role. 
Join patterns. Join patterns have been introduced in join-calculus [7].
Here we consider the so-called “zero-adic” version where channels carry no
message. Join patterns, written $a_1 \& \cdots \& a_m \triangleright \overline{b}_1 \mid \cdots \mid \overline{b}_n$, reduce as follows

$$\overline{a}_1 \mid \cdots \mid \overline{a}_m \mid a_1 \& \cdots \& a_m \triangleright \overline{b}_1 \mid \cdots \mid \overline{b}_n \;\longrightarrow\; \overline{b}_1 \mid \cdots \mid \overline{b}_n,$$

that is, a join pattern $a_1 \& \cdots \& a_m$ triggers provided all the messages $\overline{a}_1, \dots, \overline{a}_m$
are available in the solution. The rule specifies that all the messages are
grabbed at once (an all-or-nothing requirement) and all the messages $\overline{b}_1, \dots, \overline{b}_n$
are released at once.

The modeling of the above join pattern in reversible structures is given
by the term

$$(\mathrm{new}\ u_1, \dots, u_n)\,({}^\wedge a_1 . \cdots . a_m . u_1{:}\overline{b}_1 . \cdots . u_n{:}\overline{b}_n)$$


There is a difference between the semantics of this term and that of join
patterns that turns out to be irrelevant. In the above term, messages $\overline{a}_1, \dots, \overline{a}_m$
are taken in order: first $\overline{a}_1$, then $\overline{a}_2$, and so on. It is possible that
one takes $\overline{a}_1$ and $\overline{a}_2$ and then realizes that there is no $\overline{a}_3$ in the solution, a
circumstance that never occurs in join-calculus. However, this is not an issue
because (i) the input capture is reversible, therefore the messages $\overline{a}_1$ and $\overline{a}_2$
may be released in the solution, and (ii) messages/signals are asynchronous
(they have no continuation) and therefore no (continuation) process has
been triggered. Additionally, because of asynchrony, releasing messages $\overline{b}_1, \dots, \overline{b}_n$
in order or all at once is semantically the same.
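
The ordered capture with backtracking can be sketched in Python as follows. This is a deliberately simplified, deterministic illustration: signals are kept in a multiset of names (their ids and overlines are ignored), fresh ids are generated with a counter to play the role of the names bound by new, and the functions `join_gate` and `try_fire` are hypothetical helpers. In the actual model the release of captured inputs is a reversible reduction rather than an explicit rollback.

```python
from collections import Counter
from itertools import count

_fresh = count(1)  # source of fresh ids (stands in for the `new` binder)

def join_gate(inputs, outputs):
    """Encode a1 & ... & am |> b1 | ... | bn as a gate with fresh output ids."""
    return {"inputs": list(inputs),
            "outputs": [(f"u{next(_fresh)}", b) for b in outputs]}

def try_fire(gate, signals: Counter):
    """Capture the inputs in order. If some input is missing, release the
    inputs captured so far and leave the solution unchanged."""
    remaining = signals.copy()
    captured = []
    for name in gate["inputs"]:
        if remaining[name] == 0:
            remaining.update(captured)  # backtrack: undo every capture
            return False, remaining
        remaining[name] -= 1            # (input capture)
        captured.append(name)
    # (output release), one output after the other
    remaining.update(b for _, b in gate["outputs"])
    return True, +remaining
```

With a gate for `a1 & a2 & a3` and a solution containing only `a1` and `a2`, the attempt fails and the solution is returned unchanged, which mirrors the argument that taking some inputs and then giving them back has no observable effect.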


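Mixed choice. Mixed choice is a standard operation in process calculi
(see, for instance, CCS [12] or pi calculus [13, 14]) that allows the progress of
exactly one among several processes. The operator, in case of asynchronous
calculi, is usually written $\sum_{i \in I} a_i.\overline{b}_i + \sum_{j \in J} c_j$ and the semantics is defined
by one of the following rules:

$$\overline{a}_k \mid \sum_{i \in I} a_i.\overline{b}_i + \sum_{j \in J} c_j \;\longrightarrow\; \overline{b}_k \qquad (k \in I)$$

$$\overline{c}_k \mid \sum_{i \in I} a_i.\overline{b}_i + \sum_{j \in J} c_j \;\longrightarrow\; 0 \qquad (k \in J)$$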


In particular, the semantics excludes communications between two branches 
of the choice. The above mixed choice is modelled in reversible structures 
by the term 


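$$(\mathtt{new}\ v, e, u_i^{\,i \in I}, w_j^{\,j \in J})\,\Big( \prod_{i \in I} {}^{\wedge}e . a_i . u_i{:}\bar{b}_i \;\mid\; \prod_{j \in J} {}^{\wedge}e . w_j{:}\bar{c}_j \;\mid\; v{:}\bar{e} \Big)$$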
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that implements choice with parallel composition. Every gate of the above 
term is prefixed by an input on the name e that is local to the term. The 
correctness of the modeling follows from two properties: (i) at most one gate
progresses, because there is exactly one signal $v{:}\bar{e}$; (ii) if the chosen
gate is one of ${}^{\wedge}e . a_i . u_i{:}\bar{b}_i$ and the solution does not contain
a signal $\bar{a}_i$, it is possible to revert the decision and select another branch.
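
The two properties can be sketched in code. The following is a minimal illustration, assuming the solution is a multiset of message names, the private signal on `e` is a single token, and ids are ignored; the function `try_branch` and its parameters are hypothetical names used only for this example.

```python
from collections import Counter


def try_branch(solution, guard, output, token="e"):
    """Attempt one branch of the choice: take the token, then the guard.

    guard is the message the branch waits for (None for a branch that only
    emits); output is the message released when the branch commits.
    Returns True if the branch commits. On failure the capture of the token
    is reverted, so the solution is left unchanged.
    """
    if solution[token] == 0:
        return False                 # the single token is held elsewhere
    solution[token] -= 1             # capture the private signal on e
    if guard is not None:
        if solution[guard] == 0:
            solution[token] += 1     # revert the capture of e
            return False
        solution[guard] -= 1         # capture the guard message
    if output is not None:
        solution[output] += 1        # release the output message
    return True


solution = Counter({"e": 1, "a2": 1})                    # only a2 is available
first = try_branch(solution, guard="a1", output="b1")    # fails, token restored
second = try_branch(solution, guard="a2", output="b2")   # commits
solution["a1"] += 1
third = try_branch(solution, guard="a1", output="b1")    # fails, token consumed
```

Once a branch has committed the token is gone, so no other branch can progress even if its guard later becomes available.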


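Smooth orchestrators. Smooth orchestrators, introduced in [10] for
modeling synchronization patterns in web services, combine join patterns and
(input-guarded) choices. A smooth orchestrator is a term
$\sum_{i \in I} a^i_1 \& \cdots \& a^i_{m_i} \triangleright \bar{b}^i_1 \mid \cdots \mid \bar{b}^i_{n_i}$
with the semantics defined by ($k \in I$)

$$\bar{a}^k_1 \mid \cdots \mid \bar{a}^k_{m_k} \mid \sum_{i \in I} a^i_1 \& \cdots \& a^i_{m_i} \triangleright \bar{b}^i_1 \mid \cdots \mid \bar{b}^i_{n_i} \;\longrightarrow\; \bar{b}^k_1 \mid \cdots \mid \bar{b}^k_{n_k}$$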


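The modeling of such an operator in reversible structures is a byproduct of the
above encodings:

$$(\mathtt{new}\ v, e, (u^i_1, \ldots, u^i_{n_i})^{i \in I})\,\Big( \prod_{i \in I} {}^{\wedge}e . a^i_1 . \cdots . a^i_{m_i} . u^i_1{:}\bar{b}^i_1 . \cdots . u^i_{n_i}{:}\bar{b}^i_{n_i} \;\mid\; v{:}\bar{e} \Big)$$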


and correctness follows with arguments similar to the ones above. 


6 The encoding of asynchronous RCCS into reversible structures


In this section we give a precise assessment of the expressive power of
reversible structures, by discussing the encoding of a process calculus with
a reversible transition relation: the asynchronous RCCS [5, 6]. As a
consequence, it is possible to establish properties of asynchronous RCCS using
those of reversible structures. (See for example the Standardization Theorem
in [4], which has been proved for RCCS in [5].)

The syntax of asynchronous RCCS uses an infinite set of names, ranged
over by $a, b, c, \ldots$, and a disjoint set of co-names, ranged over by
$\bar{a}, \bar{b}, \bar{c}, \ldots$. Names and co-names are ranged over by
$\alpha, \beta, \ldots$ and are generically called actions. Processes $P, Q, \ldots$
are defined by the following grammar (the set $I$ is always finite):

$$P \;::=\; 0 \quad\big|\quad \sum_{i \in I} \alpha_i.P_i \quad\big|\quad \prod_{i \in I} P_i \quad\big|\quad (\mathtt{new}\ a)\, P$$

The term $0$ defines the terminated process; $\sum_{i \in I} \alpha_i.P_i$ defines a
process that may perform one action $\alpha_i$ and continues as $P_i$;
$\prod_{i \in I} P_i$ defines the parallel composition of processes $P_i$; finally
the term $(\mathtt{new}\ a)\, P$ defines a name with scope $P$. Processes meet the
following well-formedness conditions:

- in $\bar{a}.P$, the process $P$ is $0$ (continuations of co-names are empty);
- in $\prod_{i \in I} P_i$, the processes $P_i$ are guarded choices.
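
To make the grammar and the two well-formedness conditions concrete, here is a minimal sketch of the syntax as data types together with a checker. The class names, the convention of writing a co-name with a leading `~`, and the choice to represent a terminated parallel component as an empty `Choice` are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Nil:
    """The terminated process 0."""


@dataclass(frozen=True)
class Choice:
    """A guarded choice: each branch is a pair (action, continuation)."""
    branches: Tuple[Tuple[str, "Process"], ...]


@dataclass(frozen=True)
class Par:
    """Parallel composition of finitely many processes."""
    components: Tuple["Process", ...]


@dataclass(frozen=True)
class New:
    """Restriction (new name) body."""
    name: str
    body: "Process"


Process = Union[Nil, Choice, Par, New]


def is_coname(action: str) -> bool:
    # Assumed convention: "~a" stands for the co-name of "a".
    return action.startswith("~")


def well_formed(p: Process) -> bool:
    """Check the two conditions: co-names continue as 0, and the
    components of a parallel composition are guarded choices."""
    if isinstance(p, Nil):
        return True
    if isinstance(p, Choice):
        return all(
            (not is_coname(action) or isinstance(cont, Nil)) and well_formed(cont)
            for action, cont in p.branches
        )
    if isinstance(p, Par):
        return all(isinstance(c, Choice) and well_formed(c) for c in p.components)
    if isinstance(p, New):
        return well_formed(p.body)
    raise TypeError(f"not a process: {p!r}")


# a.(~b.0) + ~c.0 inside a parallel composition: well formed
ok = Par((Choice((("a", Choice((("~b", Nil()),))), ("~c", Nil()))),))
# ~a.(b.0): a co-name with a non-empty continuation is rejected
bad = Choice((("~a", Choice((("b", Nil()),))),))
```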


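The semantics of asynchronous RCCS is defined in terms of a transition
relation that uses

- memories $m$:

- run-time processes $R$:

$$R \;::=\; m \triangleright P \quad\big|\quad R \mid R \quad\big|\quad (\mathtt{new}\ a)\, R$$

- structural congruence $\equiv$, defined in the standard way (see Section 3),
plus the rules

$$m \triangleright \Big(\prod_{i \in 1..n} P_i\Big) \;\equiv\; \prod_{i \in 1..n} \big(\langle i \rangle_n \cdot m \triangleright P_i\big)$$

$$m \triangleright (\mathtt{new}\ a)\, P \;\equiv\; (\mathtt{new}\ a)\,(m \triangleright P) \qquad (a \notin \mathrm{fn}(m))$$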


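The reduction relation $\longrightarrow$ is the least relation on run-time
processes satisfying the axioms:

- $m \triangleright (a.P + Q) \mid m' \triangleright (\bar{a} + R) \;\longrightarrow\; \langle m', a, Q \rangle \cdot m \triangleright P \mid \langle m, \bar{a}, R \rangle \cdot m' \triangleright 0$,
- $\langle m', a, Q \rangle \cdot m \triangleright P \mid \langle m, \bar{a}, R \rangle \cdot m' \triangleright 0 \;\longrightarrow\; m \triangleright (a.P + Q) \mid m' \triangleright (\bar{a} + R)$,

and closed under the contextual rules for parallel, new and structural
congruence.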
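
The two axioms can be sketched as a pair of mutually inverse functions. This is an illustration under stated assumptions, not the paper's definition: only guarded choices are handled (no parallel composition or restriction), a choice is a `frozenset` of `(action, continuation)` pairs, a memory is a tuple of entries with the newest first, co-names carry a leading `~`, and the names `forward`, `backward`, `co` and `NIL` are hypothetical.

```python
NIL = frozenset()  # the terminated process 0, seen as the empty choice


def co(action):
    """Complement of an action: a <-> ~a."""
    return action[1:] if action.startswith("~") else "~" + action


def forward(t1, t2, a):
    """m > (a.P + Q) | m' > (~a + R)  -->  <m',a,Q>.m > P | <m,~a,R>.m' > 0.

    Each thread is a pair (memory, choice). Returns None if the axiom
    does not apply.
    """
    (m1, p1), (m2, p2) = t1, t2
    cont = next((c for (act, c) in p1 if act == a), None)
    if a.startswith("~") or cont is None or (co(a), NIL) not in p2:
        return None
    q = p1 - {(a, cont)}           # discarded alternatives of the receiver
    r = p2 - {(co(a), NIL)}        # discarded alternatives of the sender
    return (((m2, a, q),) + m1, cont), (((m1, co(a), r),) + m2, NIL)


def backward(t1, t2):
    """Inverse of forward; applies only when the top memory entries match."""
    (m1, p1), (m2, p2) = t1, t2
    if not m1 or not m2:
        return None
    (partner2, a, q), rest1 = m1[0], m1[1:]
    (partner1, b, r), rest2 = m2[0], m2[1:]
    if a.startswith("~") or b != co(a) or p2 != NIL:
        return None
    if partner2 != rest2 or partner1 != rest1:
        return None                # these two threads did not interact
    return (rest1, q | {(a, p1)}), (rest2, r | {(b, NIL)})


P = frozenset({("b", NIL)})
t1 = ((), frozenset({("a", P), ("c", NIL)}))       # a.P + c.0
t2 = ((), frozenset({("~a", NIL), ("~d", NIL)}))   # ~a + ~d
s1, s2 = forward(t1, t2, "a")
restored = backward(s1, s2)
```

Because each memory entry records the partner's memory, `backward` refuses to undo a step between threads that did not communicate with each other, which is the causal-consistency requirement in miniature.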


Our encoding (whose main ideas have already been discussed in Section 5)
uses environments $\Gamma$ that map memories to signals $u{:}\bar{c}$ such that:

1. ($\Gamma$ is injective) if $m \neq m'$ then the signals $\Gamma(m)$ and $\Gamma(m')$
are different (they have different ids and co-names);

2. ($\Gamma$ is suffix-closed) if $s \cdot m \in \mathrm{dom}(\Gamma)$ then $m \in \mathrm{dom}(\Gamma)$;

3. ($\Gamma$ is branch-closed) if $\langle i \rangle_n \cdot m \in \mathrm{dom}(\Gamma)$ then
$\langle 1 \rangle_n \cdot m, \ldots, \langle n \rangle_n \cdot m \in \mathrm{dom}(\Gamma)$;

4. ($\Gamma$ is depth-closed) if $\langle m', a, P \rangle \cdot m \in \mathrm{dom}(\Gamma)$ then $m' \in \mathrm{dom}(\Gamma)$.
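
The four conditions can be checked mechanically on a finite environment. The sketch below assumes a particular representation that is not in the original: a memory is a tuple of entries with the newest first, a fork entry is `("fork", i, n)`, a communication entry is `("comm", partner_memory, action, process)`, and a signal is a pair `(id, coname)`. The function name `check_environment` is hypothetical.

```python
def check_environment(env):
    """Return the four closure properties of env as a dict of booleans.

    env maps memories (tuples of entries, newest first) to signals (id, coname).
    """
    ids = [u for (u, _) in env.values()]
    conames = [c for (_, c) in env.values()]
    # injective: distinct memories get signals with distinct ids and co-names
    injective = len(set(ids)) == len(ids) and len(set(conames)) == len(conames)
    # suffix-closed: dropping the newest entry stays inside the domain
    suffix_closed = all(m[1:] in env for m in env if m)
    # branch-closed: all n siblings of a fork entry are in the domain
    branch_closed = all(
        (("fork", j, m[0][2]),) + m[1:] in env
        for m in env if m and m[0][0] == "fork"
        for j in range(1, m[0][2] + 1)
    )
    # depth-closed: the partner memory recorded in a comm entry is in the domain
    depth_closed = all(m[0][1] in env for m in env if m and m[0][0] == "comm")
    return {
        "injective": injective,
        "suffix_closed": suffix_closed,
        "branch_closed": branch_closed,
        "depth_closed": depth_closed,
    }


root = ()
env = {
    root: ("w", "c"),
    (("fork", 1, 2),): ("u1", "c1"),
    (("fork", 2, 2),): ("u2", "c2"),
}
report = check_environment(env)
```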


Let $\mathrm{fn}(\Gamma) \stackrel{\text{def}}{=} \{\, a \mid \ldots \,\}$.


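Let $\Gamma$ be an environment such that, for every $i \in I$, $\Gamma(m_i) = u_i{:}\bar{c}_i$, for
some $u_i$. The encoding $[\cdot]^{\Gamma}$ uses a number of fresh names and fresh ids and
is defined by

$$\Big[\prod_{i \in I} m_i \triangleright P_i\Big]^{\Gamma} \;=\; \prod_{i \in I} \big([P_i]_{c_i} \mid \Gamma(m_i)\big) \;\mid\; U(\{m_i \mid i \in I\}, \Gamma)$$

where $P_i$ are guarded choices and where the auxiliary functions $[P]_c$ and
$U(M, \Gamma)$ are defined in Figure 7.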


Lemma 1 If $R \longrightarrow R'$ then, for suitable $\Gamma$ and $\Gamma'$, $[R]^{\Gamma} \longrightarrow^{*} [R']^{\Gamma'}$, with
id(u,) 1 id(T(m) | T(m’)) #9. 


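Proof: Without loss of generality, let

$$R \;=\; m \triangleright \Big(\sum_{i \in I} a_i.P_i + \sum_{j \in J} \bar{a}_j\Big) \;\mid\; m' \triangleright \Big(\sum_{i \in I'} b_i.Q_i + \sum_{j \in J'} \bar{b}_j\Big) \;\mid\; m'' \triangleright P''$$

and let $a_k = b_h = a$ and $P' = \sum_{i \in I \setminus \{k\}} a_i.P_i + \sum_{j \in J} \bar{a}_j$ and
$Q' = \sum_{i \in I'} b_i.Q_i + \sum_{j \in J' \setminus \{h\}} \bar{b}_j$. Then
$R \longrightarrow \langle m', a, P' \rangle \cdot m \triangleright P_k \mid \langle m, \bar{a}, Q' \rangle \cdot m' \triangleright 0 \mid m'' \triangleright P''$.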


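Let $\Gamma$ be such that $\Gamma(m) = w{:}\bar{c}$, $\Gamma(m') = w'{:}\bar{c}'$ and $\Gamma(m'') = w''{:}\bar{c}''$.

$$\begin{array}{rcl}
[0]_c &=& {}^{\wedge}c \\[4pt]
\big[\sum_{i \in I} a_i.P_i + \sum_{j \in J} \bar{a}_j\big]_c &=& \prod_{i \in I} \big({}^{\wedge}c . a_i . u_i{:}\bar{c}_i \mid [P_i]_{c_i}\big) \;\mid\; \prod_{j \in J} {}^{\wedge}c . u_j{:}\bar{a}_j \\
&& \text{where } c_i^{\,i \in I},\ u_k^{\,k \in I \cup J} \text{ are fresh} \\[4pt]
\big[\prod_{i \in 1..n} P_i\big]_c &=& {}^{\wedge}c . u_1{:}\bar{c}_1 . \cdots . u_n{:}\bar{c}_n \;\mid\; \prod_{i \in 1..n} [P_i]_{c_i} \\
&& \text{where } c_i,\ u_i^{\,i \in 1..n} \text{ are fresh} \\[4pt]
[(\mathtt{new}\ a)\, P]_c &=& (\mathtt{new}\ a)\,([P]_c) \\[4pt]
U(\{\langle 1 \rangle_n \cdot m, \ldots, \langle n \rangle_n \cdot m\} \uplus M, \Gamma) &=& u{:}c . u_1{:}\bar{c}_1 . \cdots . u_n{:}\bar{c}_n\,{}^{\wedge} \;\mid\; U(\{m\} \uplus M, \Gamma) \\
&& \text{where } \Gamma(m) = u{:}\bar{c} \text{ and } \Gamma(\langle i \rangle_n \cdot m) = u_i{:}\bar{c}_i \\[4pt]
U(\{\langle m', a, P \rangle \cdot m,\ \langle m, \bar{a}, Q \rangle \cdot m'\} \uplus M, \Gamma) &=& u{:}c . w{:}a . v{:}\bar{c}''\,{}^{\wedge} \;\mid\; [P]_c \\
&& \mid\ u'{:}c' . w{:}\bar{a}\,{}^{\wedge} \;\mid\; [Q]_{c'} \\
&& \mid\ U(\{m, m'\} \uplus M, \Gamma) \\
&& \text{where } \Gamma(m) = u{:}\bar{c} \text{ and } \Gamma(m') = u'{:}\bar{c}' \\
&& \text{and } \Gamma(\langle m', a, P \rangle \cdot m) = v{:}\bar{c}'' \text{ and } w \text{ fresh}
\end{array}$$

Figure 7: The functions $[P]_c$ and $U(M, \Gamma)$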
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Then 
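$$\begin{array}{rl}
[R]^{\Gamma} \;\longrightarrow & w{:}c . {}^{\wedge}a . u_k{:}\bar{c}_k \mid [P_k]_{c_k} \mid \prod_{i \in I \cup J \setminus \{k\}} [P_i]_c \\
& \mid\ w'{:}\bar{c}' \mid \prod_{i \in I' \cup J'} [Q_i]_{c'} \\
& \mid\ w''{:}\bar{c}'' \mid [P'']_{c''} \mid U(\{m, m', m''\}, \Gamma) \\[6pt]
\longrightarrow & w{:}c . {}^{\wedge}a . u_k{:}\bar{c}_k \mid [P_k]_{c_k} \mid \prod_{i \in I \cup J \setminus \{k\}} [P_i]_c \\
& \mid\ w'{:}c' . {}^{\wedge}v_h{:}\bar{a} \mid \prod_{i \in I' \cup J' \setminus \{h\}} [Q_i]_{c'} \\
& \mid\ w''{:}\bar{c}'' \mid [P'']_{c''} \mid U(\{m, m', m''\}, \Gamma) \\[6pt]
\longrightarrow & w{:}c . {}^{\wedge}a . u_k{:}\bar{c}_k \mid [P_k]_{c_k} \mid \prod_{i \in I \cup J \setminus \{k\}} [P_i]_c \\
& \mid\ v_h{:}\bar{a} \mid w'{:}c' . v_h{:}\bar{a}\,{}^{\wedge} \mid \prod_{i \in I' \cup J' \setminus \{h\}} [Q_i]_{c'} \\
& \mid\ w''{:}\bar{c}'' \mid [P'']_{c''} \mid U(\{m, m', m''\}, \Gamma) \\[6pt]
\longrightarrow & w{:}c . v_h{:}a . {}^{\wedge}u_k{:}\bar{c}_k \mid [P_k]_{c_k} \mid \prod_{i \in I \cup J \setminus \{k\}} [P_i]_c \\
& \mid\ w'{:}c' . v_h{:}\bar{a}\,{}^{\wedge} \mid \prod_{i \in I' \cup J' \setminus \{h\}} [Q_i]_{c'} \\
& \mid\ w''{:}\bar{c}'' \mid [P'']_{c''} \mid U(\{m, m', m''\}, \Gamma) \\[6pt]
\longrightarrow & u_k{:}\bar{c}_k \mid [P_k]_{c_k} \mid w{:}c . v_h{:}a . u_k{:}\bar{c}_k\,{}^{\wedge} \mid \prod_{i \in I \cup J \setminus \{k\}} [P_i]_c \\
& \mid\ w'{:}c' . v_h{:}\bar{a}\,{}^{\wedge} \mid \prod_{i \in I' \cup J' \setminus \{h\}} [Q_i]_{c'} \\
& \mid\ w''{:}\bar{c}'' \mid [P'']_{c''} \mid U(\{m, m', m''\}, \Gamma) \\[6pt]
= & [R']^{\Gamma'}
\end{array}$$

where $\Gamma' = \Gamma[\langle m', a, P' \rangle \cdot m \mapsto u_k{:}\bar{c}_k \,;\, \langle m, \bar{a}, Q' \rangle \cdot m' \mapsto v_h{:}\bar{c}_h]$. Since our
structures and asynchronous RCCS are both reversible, the above computation
also demonstrates that $R' \longrightarrow R$ implies $[R']^{\Gamma'} \longrightarrow^{*} [R]^{\Gamma}$. $\Box$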


It is worth observing that, in the above encoding, there is a strict
correspondence between reversibility of our structures and reversibility in
asynchronous RCCS. To formalize this correspondence, let an occurrence of
an id $u$ be positive in a structure $S$ if $u$ occurs in a signal, or in the
$A^{\perp}$ sequence or the $C$ sequence of a gate $A^{\perp} . B . {}^{\wedge}C$ or
$A^{\perp} . {}^{\wedge}B . C$. The occurrence of $u$ is negative if it is in the $B$
sequence of a gate $A^{\perp} . B . {}^{\wedge}C$. Let the type of $g$, written
$\mathrm{type}(g)$, be the sequence of ids of co-names in $g$. For example
$\mathrm{type}(v{:}a . {}^{\wedge}a . u{:}\bar{e} . w{:}\bar{e}) = uw$ (as usual, dots are
omitted in sequences of ids).


Definition 3 A weak coherent structure $S$ is coherent whenever

- different gates in $S$ have types with no id in common;

- ids occur at most twice: one occurrence is positive and the other is
negative.
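
The two conditions of the definition can be turned into a small checker. This is a sketch under assumptions made for illustration: weak coherence is taken as given and not checked, both gate forms are flattened into a single record with four fields, a signal is a pair `(id, coname)`, and the names `Gate`, `gate_type` and `is_coherent` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    consumed: tuple = ()     # inputs already captured, as (id, name) pairs
    emitted: tuple = ()      # outputs already released, as (id, co-name) pairs
    pending_in: tuple = ()   # names still to be captured (these carry no id)
    pending_out: tuple = ()  # outputs still to be released, as (id, co-name)


def gate_type(g):
    """The sequence of ids attached to the co-names of the gate."""
    return tuple(u for (u, _) in g.emitted + g.pending_out)


def is_coherent(signals, gates):
    """Check the two coherence conditions on a (weak coherent) structure."""
    # Condition 1: different gates have types with no id in common.
    seen = set()
    for g in gates:
        ids = set(gate_type(g))
        if ids & seen:
            return False
        seen |= ids
    # Condition 2: an id occurs at most twice, once positively, once negatively.
    positive = Counter(u for (u, _) in signals)
    negative = Counter()
    for g in gates:
        positive.update(u for (u, _) in g.consumed + g.pending_out)
        negative.update(u for (u, _) in g.emitted)
    for u in set(positive) | set(negative):
        total = positive[u] + negative[u]
        if total > 2 or (total == 2 and positive[u] != 1):
            return False
    return True


# The gate v:a.^a.u:~e.w:~e used as an example in the text.
g = Gate(consumed=(("v", "a"),), pending_in=("a",),
         pending_out=(("u", "e"), ("w", "e")))
example_type = gate_type(g)
```

In this representation a released output and the signal it produced form the one permitted positive/negative pair for an id.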


Proposition 5 1. If $S$ is coherent and $S \longrightarrow S'$ then $S'$ is coherent.

2. $[\prod_{i \in I} m_i \triangleright P_i]^{\Gamma}$, where $\Gamma$ is an environment as discussed above and $P_i$
are guarded choices, is a coherent structure.


The coherence of $[\prod_{i \in I} m_i \triangleright P_i]^{\Gamma}$ and the properties of $\Gamma$ have an
immediate consequence: the terms that are desynchronized are exactly those
implementing processes that actually interacted in the past. Said more
technically, in the proof of Lemma 1, there is a unique way to desynchronize
the process $[P_k]_{c_k}$ in the solution $[R']^{\Gamma'}$, namely undoing exactly the steps
back to the solution $[R]^{\Gamma}$. No other signal or gate may interfere with these steps.

We finally observe that the reverse encoding of $[\cdot]^{\Gamma}$ seems impossible (or
at least we do not have any solution at present) because reversible structures
are not asynchronous. In fact, sequences of outputs in our gates are not
encodable in asynchronous RCCS.


7 Conclusion 


In this paper we have introduced reversible structures, an algebra for massive
concurrent systems, where terms retain bits of causal dependencies that
allow one to reverse computation histories. We have discussed the model
that has inspired reversible structures, the DNA three-domains strands, and
studied the implementation of (weak coherent) reversible structures in DNA
strands. We have finally analyzed significant synchronization patterns of
process algebra and their modeling schemas in reversible structures.

In the companion paper [4] we develop the theory of causal dependencies
in reversible structures. Following Lévy [11], we define an equivalence on
computations, the permutation equivalence, based on labels of terms, that
abstracts away from the order of causally independent reductions. We
then demonstrate a standardization theorem that permits shortening of
computations by removing converse reductions. This theorem seems strange
because, in reversible structures, removals may address reverse reductions
performed by different terms (of the same species). In fact, in our setting,
labels are not powerful enough to discriminate among molecules of the same
species. We finally study coherent reversible structures that have been
defined in Section 6. We demonstrate that the reachability problem in 
coherent structures has a computational complexity that is quadratic with 
respect to the size of the structures, a problem that is EXPSPACE-complete 
in weak coherent structures. 

Our study prompts a thorough analysis of reversible calculi where
processes have multiplicities and the causal dependencies between copies may
be exchanged. Open questions are: (i) What synchronization schemas can be
programmed in massive concurrent systems? (ii) Are there other constraints,
different from coherence, under which relevant bio-chemical properties admit
better algorithms than in standard structures? (iii) What is the theory
of massive (reversible) systems with irreversible operators and what is the
relationship with standard programming languages?
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